Database modernization from Db2 and Oracle tables with LOB data types to an Azure SQL PaaS offering (Azure SQL Database / Managed Instance) or SQL Server needs careful planning and, at the same time, provides an opportunity to optimize your database. Having TBs or PBs of LOB data stored as part of any database has space implications as well as other maintenance overheads.
In this technical blog, we discuss the approach of storing the LOB data as an Azure Data Lake Storage blob object and storing the location of that blob object in an Azure SQL DB table. This approach reduces the time needed for maintenance, provides additional flexibility for backup / DR / storage tier, and significantly reduces the size of the SQL DB. This blog references Db2 z/OS as the source, but the same approach can be extended to other Db2 flavors or other databases like Oracle.
This blog outlines how to copy BLOB data from a Db2 z/OS database (the same concept applies to Db2 LUW, Db2 i, Oracle, etc.) to Azure using Azure Data Factory and Logic Apps. The approach outlined here is a low-code/no-code solution with ease of maintenance.
To identify the compatible SQL target database, you can conduct an assessment using SSMA for Db2. Once the SQL database is determined, SSMA can also be used to convert the schema and other database objects. Step-by-step guidance is available here.
Azure Data Factory is a managed cloud service built for complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects. It has a Db2 connector and supports bulk copy.
Azure Logic Apps is an integration PaaS for automating workflows.
In our solution, we have used the ADF Copy activity with the Db2 connector, with Db2 as the source and Azure SQL as the sink. ADF performs bulk copy, so getting data to Azure SQL is much quicker; once the data is present in Azure SQL, the serverless Logic App performs blob object creation and metadata population.
This solution helps transfer LOB data residing in a Db2 table to an ADLS Gen2 storage account and updates the blob metadata information in Azure SQL DB. This approach lets you store LOB data in an Azure storage account and reference it from the Azure SQL database via blob object metadata. For data tier modernization projects, Db2 will usually be in an on-premises environment. ADF provides the ability to perform bulk copies and to parallelize the copy to Azure.
For simplicity, we have divided this into two stages.
Stage 1 is to copy the Db2 table data to Azure SQL using ADF.
Stage 2 is to copy the data from Azure SQL to Azure Data Lake Storage and record the metadata in Azure SQL.
Set up the Azure Data Factory self-hosted integration runtime (SHIR) on a machine which has access to the Db2 / on-premises database.
Create Azure SQL Database.
Create Linked Service for Db2.
Create Linked Service for Azure SQL DB.
Create ADF Pipeline with Copy activity which has Db2 as source and Azure SQL DB as sink.
Create a LogicApp workflow to read Azure SQL DB and create a blob in Azure Storage Account.
Update the blob object metadata information in Azure SQL DB.
The diagram below describes the high-level architecture of the approach.
High-level architecture to copy data from Db2 to Azure
Components present in the above diagram:
The Azure Data Factory Db2 connector can be used to create a linked service to Db2 using details such as the Db2 database hostname, port number, and the Mainframe RACF user ID and password. Detailed instructions for the same can be found here.
The Azure Data Factory Azure SQL DB connector can be used to create a linked service to the SQL DB. Instructions for the same can be found here.
CREATE TABLE "dba1"."BLOB_TABLE " (
"ID" INTEGER WITH DEFAULT NULL,
"PK_DB2" INTEGER NOT NULL,
"IMAGEBLOB" BLOB(50000) WITH DEFAULT NULL,
CONSTRAINT "BLOBTABLE_PK" PRIMARY KEY
("PK_DB2")
)
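Before planning the copy, it can help to gauge the LOB footprint of the source table. The query below is a minimal sketch using only the column and table names from the DDL above; column aliases are illustrative.
-- Sizing sketch (Db2 SQL): row count and total LOB bytes in the source table.
SELECT COUNT(*) AS TOTAL_ROWS,
       SUM(LENGTH(IMAGEBLOB)) AS TOTAL_LOB_BYTES
FROM "dba1"."BLOB_TABLE";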
Corresponding table in Azure SQL DB (the ADF Copy sink):
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[BLOB_TABLE](
    [IDENTITY_ID] [int] IDENTITY(1,1) NOT NULL,
    [PART_ID] [int] NOT NULL,
    [DB2_PK] [int] NOT NULL,
    [IMG_BLOB_CONTENT] [varbinary](max) NULL
) ON [PRIMARY]
GO
Alternatively, we can also use the same source table schema with two additional columns to hold the blob path and blob ID.
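As a minimal sketch of that alternative (the table name and the two extra column names below are illustrative, not part of the tested setup), the sink table would simply carry two additional nullable columns:
CREATE TABLE [dbo].[BLOB_TABLE_WITH_PATH](
    [IDENTITY_ID] [int] IDENTITY(1,1) NOT NULL,
    [PART_ID] [int] NOT NULL,
    [DB2_PK] [int] NOT NULL,
    [IMG_BLOB_CONTENT] [varbinary](max) NULL,
    [PATH] [varchar](200) NULL,          -- ADLS container / folder path
    [IMAGE_BLOB_ID] [varchar](200) NULL  -- blob object name
) ON [PRIMARY]
GO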
Blob metadata table in Azure SQL DB (populated by the Logic App in stage 2):
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[BLOB_Metadata](
    [IDENTITY_COL] [int] IDENTITY(1,1) NOT NULL,
    [DB2_PARTID] [int] NULL,
    [DB2_PK] [int] NULL,
    [PATH] [varchar](200) NULL,
    [IMAGE_blob_ID] [varchar](200) NULL
) ON [PRIMARY]
GO
In the ADF pipeline, set the source to Db2 and use the linked service created in the previous steps.
If the source data is huge, we might want to use a lookup table approach and invoke parallel copies, as sketched below.
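As a hedged illustration of that approach, a query like the one below (the bucket size of 100,000 rows is an assumption) returns non-overlapping PK_DB2 ranges that an ADF Lookup + ForEach pattern could feed to parallel Copy activities, each using a WHERE PK_DB2 BETWEEN <low> AND <high> source query:
-- Illustrative Db2 query: derive key ranges for parallel copies.
SELECT BUCKET,
       MIN(PK_DB2) AS LOW_KEY,
       MAX(PK_DB2) AS HIGH_KEY,
       COUNT(*) AS ROWS_IN_BUCKET
FROM (
    SELECT PK_DB2,
           (ROW_NUMBER() OVER (ORDER BY PK_DB2) - 1) / 100000 AS BUCKET
    FROM "dba1"."BLOB_TABLE"
) AS T
GROUP BY BUCKET
ORDER BY BUCKET;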
In this case, we have chosen to copy the entire Db2 table to Azure SQL in a single Copy activity.
Mapping of Db2 columns to Azure SQL table columns.
Below is the Logic App setup created for testing.
Components present in the above diagram:
Logic Apps offers multiple trigger types, such as the Recurrence trigger (run at a certain frequency), file watcher triggers, and the HTTP request endpoint trigger. In this case, we have used the HTTP request endpoint trigger because we need to integrate the workflow with ADF.
On creation of the “When a HTTP request is received” activity, an HTTP POST URL is generated. Copy this HTTP POST URL; we will use it in the ADF activity that invokes the Logic App later in this document.
In our case, we are not passing any parameters, so the JSON schema is very simple. If we want to pass any additional payload in the HTTP request, the schema needs to be updated accordingly.
{
    "properties": {
        "body": {
            "properties": {},
            "type": "object"
        },
        "headers": {
            "properties": {
                "Content-Type": {
                    "type": "string"
                }
            },
            "type": "object"
        },
        "method": {
            "type": "string"
        },
        "url": {
            "type": "string"
        }
    },
    "type": "object"
}
In this activity, we are reading the content of BLOB_TABLE from Azure SQL DB. Alternatively, we can use the Get rows action.
In our case, as the entire BLOB_TABLE is in scope, we have given ‘select * from dbo.BLOB_TABLE;’.
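If only a subset of rows should be processed in a given run, the same step can use a filtered query instead; the IDENTITY_ID watermark and its value below are assumptions for illustration, not part of the tested setup.
-- Illustrative filtered query for this step; 100 is a placeholder watermark value.
select * from dbo.BLOB_TABLE where IDENTITY_ID > 100;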
Once we fetch the data from the SQL table, we need to create an Azure Storage blob object for every row (each row holds the LOB data), so we need to loop through all the records that are fetched.
Under “Select an output from previous steps”, choose the result set returned by the query in the previous step. Within the For Each loop, add an action named “Create blob”.
Here we need to perform the steps below.
Once the above steps are executed, we should see a blob object in the Azure storage account.
In this step, we populate the blob object metadata table so the application can refer to the metadata information and fetch the image from Azure Storage.
Within the For Each block, add a SQL Server action and choose the “Insert row” action.
These mappings are provided for reference; based on application need, we should be able to add more variables to the metadata table and populate the appropriate information.
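For reference, each iteration of the Insert row action is effectively equivalent to an insert like the hedged example below; the container path and blob name values are placeholders and depend on how the Create blob step names its objects.
-- Illustrative equivalent of one "Insert row" iteration against the metadata table.
INSERT INTO dbo.BLOB_Metadata (DB2_PARTID, DB2_PK, [PATH], IMAGE_blob_ID)
VALUES (1, 1001, '/lobcontainer/images/', '1001.jpg');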
With this setup, we can manually execute the ADF Copy activity and the Logic App workflow. In this section, we want to provide more information on how to integrate ADF with the Logic App using the HTTP / API method. Detailed documentation on how to invoke a Logic App via HTTP is available in Microsoft Docs here.
From the Logic App workflow, copy the HTTP POST URL of the “When a HTTP request is received” activity.
In ADF, attach a Web activity to the existing Copy activity and follow the steps below.
With this, if we save and debug / trigger the pipeline, the Logic App is invoked automatically after the ADF Copy activity is complete.
The input Db2 table has 25 rows as shown below. For testing purposes, the same BLOB content was used across all 25 rows.
Now execute the ADF pipeline. For testing, we are using “Trigger Now” in ADF. Based on business requirements, the trigger can be schedule based, storage event based, or a custom event.
Once the ADF pipeline is executed, we should be able to see the status in the “Monitor” section.
Now look at the BLOB_TABLE table in Azure SQL. As a next step, let us validate the Logic App.
LOB data in Azure Data Lake Containers
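From the SQL side, a simple hedged validation is to compare the number of staged LOB rows with the number of metadata rows that received a path, using only the tables defined earlier in this post:
-- Validation sketch: every staged LOB row should have a matching metadata row with a path.
SELECT
    (SELECT COUNT(*) FROM dbo.BLOB_TABLE) AS staged_rows,
    (SELECT COUNT(*) FROM dbo.BLOB_Metadata WHERE [PATH] IS NOT NULL) AS metadata_rows_with_path;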
These decisions need to be made by considering the source data structure, partitioning, and SHIR capacity.
If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL Ninja Engineering Team (datasqlninja@microsoft.com). Thanks for your support!
Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.